Abstract

We show that, for each value of $\alpha \in (-1, 1)$, the only Riemannian
metrics on the space of positive definite matrices for which the
$\nabla^{(\alpha)}$ and $\nabla^{(-\alpha)}$ connections are mutually dual
are matrix multiples of the Wigner-Yanase-Dyson metric. If we further
impose that the metric be monotone, then this set is reduced to scalar
multiples of the Wigner-Yanase-Dyson metric.

1 Introduction

Classical information geometry addresses the differential geometric
properties of families of classical probability densities. Quantum
information geometry is its noncommutative counterpart, dealing with the
geometric structure of families of quantum probabilities. The classical
theory has already been explored and extended substantially, to the point
of treating the geometric structures of the infinite dimensional Banach
manifold of all probability measures equivalent to a given one [28, 9].
All the ingredients of Amari's original theory [1], such as the Fisher
metric and the exponential, mixture and $\alpha$-connections, have been
defined for this general manifold, from which the finite dimensional
results follow by restricting them to its finite dimensional submanifolds
[10]. In comparison, the quantum version still has "miles to go before
sleep", being so far mostly restricted to the geometry of density matrices
on finite dimensional Hilbert spaces. It stands as proof of the richness
of the quantum domain that even this limited setup already offers many
challenging problems that are completely absent in the classical case.

A central theme in the passage from classical to quantum information
geometry is the breakdown of Chentsov's result that the Fisher metric is
the unique Riemannian metric (up to scalar multiples) on finite dimensional
classical information manifolds which is reduced by all Markov morphisms.
As proved by Petz [25], there are infinitely many Riemannian metrics on a
matrix space with the property of being reduced by stochastic maps (the
quantum analogue of Markov morphisms). Having characterized all these
possible monotone metrics in terms of operator monotone functions, Petz's
result opened the way to two different trends: to deal with the whole set
of monotone metrics at once and try to find yet other characterizations
[26], or to find out which among them are more natural than the others
according to properties beyond monotonicity [17, 35]. This paper is
dedicated to the second of these trends. Its general attitude could be
rephrased as: if monotonicity is not enough to single out one particular
metric, what other conditions should be imposed in order to obtain a
unique metric on the information manifolds of density matrices? The
answer we offer is based on the concept of duality for affine connections
with respect to a given metric.

There are two flat connections that can be introduced on information
manifolds in a fundamental way: the mixture connection, coming from the
linear structure of the manifold itself (either as a subset of $L^1$ in
the classical case or as a subset of the trace class operators in the
quantum case), and the exponential connection, coming from the linear
structure of their logarithms. The former, denoted by $\nabla^{(-1)}$ or
$\nabla^{(m)}$, arises naturally when we consider mixed states (classical
or quantum), whereas the latter, denoted by $\nabla^{(1)}$ or
$\nabla^{(e)}$, is intimately related to the concepts of moment generating
functionals and partition functions. For infinite dimensional classical
manifolds, the exponential connection has been rigorously defined making
use of exponential Orlicz spaces, while the mixture connection is
similarly defined based on the conjugate Orlicz space of type $L \log L$.
Of course the nonparametric definitions are designed in such a way that
when restricted to finite dimensional submanifolds they reduce to the long
standing definitions of the parametric theory. For infinite dimensional
quantum information manifolds, the exponential connection was obtained in
[33, 32, 12], using the technique of small perturbations of forms and
operators in Hilbert spaces, but the mixture connection poses a much
harder problem, which is to some extent still open. Fortunately, the
situation is straightforward as far as finite dimensional quantum systems
are concerned. Many authors have proposed essentially equivalent
definitions for the exponential and mixture connections on manifolds of
density matrices [15, 24, 21]. We summarize our views on these definitions
for $\nabla^{(1)}$ and $\nabla^{(-1)}$ in [13], where we observed that they
are flat connections by explicitly constructing affine coordinate systems
for each of them.

Two connections are said to be dual with respect to a metric if the
combined action of their parallel transports is compatible with the metric
(see section 3 below for the technical definition). The same pair of
connections can be dual with respect to a multitude of metrics. It is then
meaningful to ask, for a given pair of connections, what are all the
possible metrics that make them dual. When we looked at the mixture and
the exponential connections on finite dimensional quantum systems, we
found in [13] that the only metrics with this duality property are matrix
multiples of the Bogoliubov-Kubo-Mori inner product. Using Petz's
characterization, we then obtained the improved result that the only
monotone metrics which make the $\pm 1$-connections dual are scalar
multiples of the BKM metric. The purpose of the present paper is to
investigate the same kind of question for the more general pairs of
$\pm\alpha$-connections.

In the classical version of Information Geometry, there are two equivalent
ways of defining the $\alpha$-connections $\nabla^{(\alpha)}$ on an
information manifold $\mathcal{M}$, for $\alpha \in (-1,1)$. The first
approach consists of using the $\alpha$-embeddings of the form
$p \mapsto \frac{2}{1-\alpha}\, p^{\frac{1-\alpha}{2}}$ to map
$\mathcal{M}$ into the sphere of radius $r$ in the Banach space $L^r$, for
$r = \frac{2}{1-\alpha}$. One then looks at the natural connection on
$L^r$, that is, the one for which the parallel transport is just the
identity map, and its canonical projection onto the sphere of radius $r$.
The pullback of the latter (again using the $\alpha$-embedding) is then
defined to be the $\alpha$-connection on $\mathcal{M}$. For finite
dimensional manifolds, this can be traced back to the early works of
Amari, where the $\alpha$-embeddings are introduced without explicit
mention of what their target spaces should be. For infinite dimensional
information manifolds, one has to explicitly make use of the functional
analytic properties of the spaces $L^r$ (namely that they are locally
convex spaces), in order to unequivocally define what is meant by the
canonical projection onto a sphere. This has been done in detail, and in a
slightly different fashion in [11]. In any event, one can prove that

$$\nabla^{(\alpha)} = \frac{1+\alpha}{2}\,\nabla^{(1)} + \frac{1-\alpha}{2}\,\nabla^{(-1)}, \tag{1}$$

which can then be taken as an equivalent definition for
$\nabla^{(\alpha)}$. Proposals for the quantum analogues of
$\alpha$-connections, for both finite and infinite dimensional manifolds,
have appeared in a number of papers. They all use the $\alpha$-embeddings
in one way or another. We present them in section 2, where we review some
of their most relevant properties. As it turns out, the
$\alpha$-embedding definitions are no longer equivalent to (1), that is,
to the definition based on the convex mixture of the $\pm 1$-connections.
We shall have more to say about this point later on in the paper.

As is well known, the BKM metric is a limiting case of the more general
family of Wigner-Yanase-Dyson metrics, denoted by $g^{(\alpha)}$ (more
about this notation later). The WYD metrics made their first appearance in
the context of quantum information geometry in the work of Hasegawa [14].
It was later proved that they are monotone for all values of
$\alpha \in [-3, 3]$. In the spirit of the $\alpha$-embeddings discussed
above, for which the target spaces are $L^r$, with
$r = \frac{2}{1-\alpha}$, we restrict our discussion to the range
$\alpha \in (-1,1)$, thus corresponding to $r \in (1,\infty)$. It is
straightforward to prove that, for each fixed value of $\alpha$ in this
range, the $\pm\alpha$-connections are dual with respect to the metric
$g^{(\alpha)}$ [15]. The formal limits $\alpha \to \pm 1$ lead to the BKM
metric and the exponential and mixture connections, for which the duality
is established separately [24].

Following the same technique of [13], we obtain the converse of this
result. We find in section 3 that, for each fixed value of
$\alpha \in (-1,1)$, the only metrics for which $\nabla^{(\alpha)}$ and
$\nabla^{(-\alpha)}$ are dual are matrix multiples of the WYD metric
$g^{(\alpha)}$. Using Petz's characterization, we obtain in section 4 that
the only monotone metrics on positive definite matrices which make the
$\pm\alpha$-connections dual are scalar multiples of $g^{(\alpha)}$.



2 The quantum $\alpha$-connections
2.1 The $\alpha$-representation

Following earlier notation, let $\mathcal{H}^N$ be a finite dimensional
complex Hilbert space, $B(\mathcal{H}^N)$ the algebra of operators on
$\mathcal{H}^N$, $\mathcal{A}$ its $N^2$-dimensional real vector subspace
of self-adjoint operators and $\mathcal{M}$ the $n$-dimensional
submanifold of all invertible density operators on $\mathcal{H}^N$, with
$n = N^2 - 1$. For $\alpha \in (-1, 1)$, define the $\alpha$-embedding of
$\mathcal{M}$ into $\mathcal{A}$ as

$$\ell_\alpha : \mathcal{M} \to \mathcal{A}, \qquad \rho \mapsto \frac{2}{1-\alpha}\,\rho^{\frac{1-\alpha}{2}}.$$

Since $\mathcal{A}$ is itself a vector space, its tangent vectors consist
of the partial derivatives of curves in $\mathcal{A}$. Therefore we can
use the $\alpha$-embedding to obtain an explicit representation of the
tangent bundle of $\mathcal{M}$ in terms of operators in $\mathcal{A}$,
provided we can efficiently take partial derivatives of functions of
operators in $\mathcal{A}$. The noncommutative nature of quantum manifolds
makes a full appearance at this point, since the derivative of a matrix
with respect to its parameters does not necessarily commute with the
original matrix. As a result, tools such as the chain rule do not hold in
matrix calculus. To overcome this difficulty, at least for functions of
density matrices, we make use of the following decomposition. In the
sequel, for $A \in B(\mathcal{H}^N)$, let
$C(A) = \{B \in B(\mathcal{H}^N) : [A,B] = 0\}$ denote its commutant.

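Lemma 2.1 (Hasegawa, 1997) Let $S = \{\rho(\theta)\}$ be a smooth manifold
of invertible density matrices. Then there exists an anti-self-adjoint
operator $\Delta_i$ such that

$$\frac{\partial\rho}{\partial\theta^i} = \frac{\partial^c\rho}{\partial\theta^i} + [\rho,\Delta_i], \qquad \frac{\partial^c\rho}{\partial\theta^i} \in C(\rho), \quad [\rho,\Delta_i] \in C(\rho)^{\perp}, \tag{2}$$

the orthogonality being with respect to the Hilbert-Schmidt inner product
in $B(\mathcal{H}^N)$. Moreover, for any function $F$ which is
differentiable on a neighbourhood of the spectrum of $\rho$ we have

$$\frac{\partial F(\rho)}{\partial\theta^i} = \frac{\partial^c F(\rho)}{\partial\theta^i} + [F(\rho),\Delta_i], \qquad \frac{\partial^c F(\rho)}{\partial\theta^i} \in C(\rho), \quad [F(\rho),\Delta_i] \in C(\rho)^{\perp}. \tag{3}$$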
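The decomposition in Lemma 2.1 can be made concrete in the eigenbasis of $\rho$. The following is a minimal numerical sketch, not taken from the paper: it assumes that $\rho$ has a non-degenerate spectrum, takes the commuting part to be the diagonal of the derivative in the eigenbasis of $\rho$, and builds $\Delta$ from the off-diagonal entries divided by the eigenvalue gaps. The function name `hasegawa_split` and the random test data are illustrative assumptions.

```python
import numpy as np

def hasegawa_split(rho, drho):
    """Split drho into a part commuting with rho plus a commutator [rho, Delta].

    Assumes rho is Hermitian with non-degenerate spectrum (illustrative sketch).
    """
    p, U = np.linalg.eigh(rho)
    d = U.conj().T @ drho @ U                  # drho written in the eigenbasis of rho
    diag = np.diag(np.diag(d))
    commuting = U @ diag @ U.conj().T          # diagonal part commutes with rho
    gaps = p[:, None] - p[None, :]
    safe = np.where(gaps == 0, 1.0, gaps)      # the diagonal of (d - diag) is zero anyway
    delta = U @ ((d - diag) / safe) @ U.conj().T
    return commuting, delta

rng = np.random.default_rng(0)
G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = G @ G.conj().T
rho /= np.trace(rho).real                      # an invertible density matrix
H = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
drho = H + H.conj().T
drho -= np.trace(drho).real / 3 * np.eye(3)    # traceless Hermitian direction

commuting, delta = hasegawa_split(rho, drho)
commutator = rho @ delta - delta @ rho
hs_inner = np.trace(commuting.conj().T @ commutator)  # Hilbert-Schmidt inner product
```

In this sketch `commuting + commutator` reproduces `drho`, `delta` is anti-self-adjoint, and the two parts are orthogonal in the Hilbert-Schmidt inner product, which are the three properties stated in (2).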



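At each point $\rho \in \mathcal{M}$, consider the subspace of
$\mathcal{A}$ defined by

$$\mathcal{A}^{(\alpha)}_\rho = \left\{ A \in \mathcal{A} : \mathrm{Tr}\left(\rho^{\frac{1+\alpha}{2}} A\right) = 0 \right\}.$$

Using (3) with $F(\rho) = \ell_\alpha(\rho)$, we obtain

$$\frac{\partial \ell_\alpha(\rho)}{\partial\theta^i} = \rho^{\frac{1-\alpha}{2}}\,\frac{\partial^c \log\rho}{\partial\theta^i} + \left[\frac{2}{1-\alpha}\,\rho^{\frac{1-\alpha}{2}}, \Delta_i\right].$$

Therefore, it follows from the normalization condition
$\mathrm{Tr}\,\rho = 1$ and the cyclicity of the trace that

$$\mathrm{Tr}\left(\rho^{\frac{1+\alpha}{2}}\,\frac{\partial \ell_\alpha(\rho)}{\partial\theta^i}\right) = 0, \tag{4}$$

so that $\partial \ell_\alpha(\rho)/\partial\theta^i \in \mathcal{A}^{(\alpha)}_\rho$.
We can then define the isomorphism

$$(\ell_\alpha)_{*(\rho)} : T_\rho\mathcal{M} \to \mathcal{A}^{(\alpha)}_\rho, \qquad v \mapsto (\ell_\alpha \circ \gamma)'(0), \tag{5}$$

where $\gamma : (-\varepsilon, \varepsilon) \to \mathcal{M}$ is a curve
in the equivalence class of the tangent vector $v$. We call this
isomorphism the $\alpha$-representation of the tangent space
$T_\rho\mathcal{M}$. If $(\theta^1, \ldots, \theta^n)$ is a coordinate
system for $\mathcal{M}$, then the $\alpha$-representation of the basis
$\left\{\frac{\partial}{\partial\theta^1}, \ldots, \frac{\partial}{\partial\theta^n}\right\}$
of $T_\rho\mathcal{M}$ is
$\left\{\frac{\partial\ell_\alpha(\rho)}{\partial\theta^1}, \ldots, \frac{\partial\ell_\alpha(\rho)}{\partial\theta^n}\right\}$.
The $\alpha$-representation of a vector field $X$ on $\mathcal{M}$ is
therefore the $\mathcal{A}$-valued function given by
$X^{(\alpha)}(\rho) = (\ell_\alpha)_{*(\rho)} X_\rho$.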

2.2 The covariant derivative $\nabla^{(\alpha)}$

The $\pm 1$-connections have a simple definition in terms of their
parallel transports, essentially because the $\pm 1$-embeddings map
$\mathcal{M}$ into sets with an affine structure (the density operators
themselves in the $-1$-embedding and their logarithms in the
$1$-embedding). Once their (flat) parallel transports are defined, it is
then a simple matter to find the coefficients of their covariant
derivatives, as well as to exhibit affine coordinate systems for them, as
explained for instance in the second section of [13]. However, as noted in
the introduction, the $\alpha$-embeddings can be viewed as a map from
$\mathcal{M}$ into the positive orthant of the sphere of radius
$r = \frac{2}{1-\alpha}$ in $\mathcal{A}$ when we equip $\mathcal{A}$ with
the $r$-norm

$$\|A\|_r := \left(\mathrm{Tr}\,|A|^r\right)^{1/r}.$$

Indeed, we can readily verify that, for any $\rho \in \mathcal{M}$,

$$\|\ell_\alpha(\rho)\|_r = \left(\mathrm{Tr}\left[\left(r\rho^{1/r}\right)^r\right]\right)^{1/r} = r\,(\mathrm{Tr}\,\rho)^{1/r} = r,$$

so that $\ell_\alpha(\rho) \in S^r$, the sphere of radius $r$ in
$\mathcal{A}$. More interestingly, it can be shown that the tangent space
at a point $\sigma \in S^r$ is

$$T_\sigma S^r = \left\{A \in \mathcal{A} : \mathrm{Tr}\left(A\sigma^{r-1}\right) = 0\right\}$$

(this is a special case of the geometry of spheres in the more general
context of uniformly convex Banach spaces). If we put
$\sigma = \ell_\alpha(\rho) = r\rho^{1/r}$, we find that

$$T_{r\rho^{1/r}} S^r = \left\{A \in \mathcal{A} : \mathrm{Tr}\left(A\rho^{1-1/r}\right) = 0\right\} = \mathcal{A}^{(\alpha)}_\rho,$$

so that the $\alpha$-representation (5) is indeed an isomorphism between
tangent spaces, as the push-forward notation suggests.
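The two facts just used, that $\ell_\alpha(\rho)$ has $r$-norm equal to $r$ and that the $\alpha$-representation of a tangent vector is trace-orthogonal to $\rho^{(1+\alpha)/2}$, can be checked numerically. The sketch below is only an illustration under stated assumptions: the density matrix, the value $\alpha = 0.4$, the helper names and the finite-difference step are all choices made here, and the derivative of $\ell_\alpha$ is approximated by a central difference along a traceless Hermitian direction.

```python
import numpy as np

def herm_power(S, p):
    """Power of a positive definite Hermitian matrix via its spectral decomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.conj().T

def ell(rho, alpha):
    """The alpha-embedding rho -> 2/(1-alpha) * rho^((1-alpha)/2)."""
    return 2.0 / (1.0 - alpha) * herm_power(rho, (1.0 - alpha) / 2.0)

def r_norm(A, r):
    """(Tr |A|^r)^(1/r) for a Hermitian matrix A."""
    return np.sum(np.abs(np.linalg.eigvalsh(A)) ** r) ** (1.0 / r)

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho = (Q * np.array([0.5, 0.3, 0.2])) @ Q.conj().T   # density matrix with known spectrum

alpha = 0.4
r = 2.0 / (1.0 - alpha)
norm_of_embedding = r_norm(ell(rho, alpha), r)

H = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
v = H + H.conj().T
v -= np.trace(v).real / 3 * np.eye(3)                # tangent vectors to M are traceless
v /= np.linalg.norm(v)

h = 1e-5                                             # finite-difference step (assumption)
v_alpha = (ell(rho + h * v, alpha) - ell(rho - h * v, alpha)) / (2 * h)
tangency_defect = np.trace(herm_power(rho, (1.0 + alpha) / 2.0) @ v_alpha).real
```

Here `norm_of_embedding` should agree with `r` up to rounding, and `tangency_defect` should vanish up to the finite-difference error, in line with (4).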

The sphere $S^r$ inherits a natural connection obtained by projecting the
trivial connection on $\mathcal{A}$ (the one where parallel transport is
just the identity map) onto its tangent space at each point. For each
$\sigma \in S^r$, the canonical projection from the tangent space
$T_\sigma\mathcal{A}$ onto the tangent space $T_\sigma S^r$ is uniquely
given by

$$\Pi_\sigma : T_\sigma\mathcal{A} \to T_\sigma S^r, \qquad A \mapsto A - r^{-r}\,\mathrm{Tr}\left[A\sigma^{r-1}\right]\sigma.$$

For $\sigma = \ell_\alpha(\rho) = r\rho^{1/r}$, this gives

$$\Pi_{r\rho^{1/r}} : A \mapsto A - \mathrm{Tr}\left[\rho^{\frac{1+\alpha}{2}} A\right]\rho^{\frac{1-\alpha}{2}}.$$

We can now define the covariant derivative of the $\alpha$-connection.
Starting with a differentiable vector field $s \in S(T\mathcal{M})$, we
first push it forward under the $\alpha$-embedding along a curve $\gamma$
to obtain $(\ell_\alpha)_{*(\gamma(t))} s \in T\mathcal{A}$. We then take
its covariant derivative with respect to the trivial connection on
$\mathcal{A}$, denoted by $\widetilde{\nabla}$, in the direction of
$(\ell_\alpha)_{*(\rho)} v$, that is, the push-forward of a tangent vector
$v \in T_\rho\mathcal{M}$. The result is a vector in
$T_{r\rho^{1/r}}\mathcal{A}$, which we then project down to
$T_{r\rho^{1/r}} S^r$ using the operator $\Pi_{r\rho^{1/r}}$ above.
Finally, we pull it back to $T_\rho\mathcal{M}$ using
$(\ell_\alpha)^{-1}_{*(\rho)}$ and call it the $\alpha$-covariant
derivative of the vector field $s$ in the direction of the tangent vector
$v$ at the point $\rho \in \mathcal{M}$. The formula for all these
operations reads as follows.



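Definition 1 For $\alpha \in (-1, 1)$, let
$\gamma : (-\varepsilon, \varepsilon) \to \mathcal{M}$ be a smooth curve
such that $\rho = \gamma(0)$ and $v = \dot{\gamma}(0)$ and let
$s \in S(T\mathcal{M})$ be a differentiable vector field. The
$\alpha$-connection on $T\mathcal{M}$ is given by

$$\left(\nabla^{(\alpha)}_v s\right)(\rho) = (\ell_\alpha)^{-1}_{*(\rho)}\left[\Pi_{r\rho^{1/r}}\,\widetilde{\nabla}_{(\ell_\alpha)_{*(\rho)} v}\,(\ell_\alpha)_{*(\gamma(t))}\, s\right]. \tag{6}$$

Using the definition (6), we find that the $\alpha$-representation of the
$\alpha$-covariant derivative of the vector field
$\partial/\partial\theta^j$ in the direction of the tangent vector
$\partial_i := \partial/\partial\theta^i$ is

$$\left(\nabla^{(\alpha)}_{\partial_i}\frac{\partial}{\partial\theta^j}\right)^{(\alpha)} = \frac{\partial^2\ell_\alpha}{\partial\theta^i\partial\theta^j} - \mathrm{Tr}\left[\rho^{\frac{1+\alpha}{2}}\,\frac{\partial^2\ell_\alpha}{\partial\theta^i\partial\theta^j}\right]\rho^{\frac{1-\alpha}{2}}. \tag{7}$$

2.3 The $\alpha$-parallel transport and the extended manifold $\widehat{\mathcal{M}}$

The $\alpha$-parallel transport of a tangent vector between tangent spaces
at different points in $\mathcal{M}$ is the pull-back of the parallel
transport of its $\alpha$-representation in $\mathcal{A}$. The latter, in
turn, consists of the identity map followed by the canonical projection
onto $TS^r$ at all points along a curve on $S^r$. It is obviously path
dependent, and therefore no longer flat, unlike the $\pm 1$-parallel
transports. This is a consequence of the fact that among all
$L^p$-spaces, for $1 \le p \le \infty$, only the spaces $L^1$ and
$L^\infty$ have spheres which are flat with respect to their trivial
connections (recall the shape of the unit circles in $\mathbb{R}^2$ for
all the different $L^p$-norms).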

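Now let us consider the extended manifold of faithful weights
$\widehat{\mathcal{M}}$ (the positive definite matrices). Observe first
that the $\alpha$-embedding in this case maps $\widehat{\mathcal{M}}$ to
itself. Moreover, for any $\sigma \in \widehat{\mathcal{M}}$,
$T_\sigma\widehat{\mathcal{M}} = T_\sigma\mathcal{A} \simeq \mathcal{A}$,
so that there is no need to do any projection in order to obtain the
parallel transport on $\widehat{\mathcal{M}}$ induced by the
$\alpha$-embedding. We can therefore define the $\alpha$-parallel
transport on $\widehat{\mathcal{M}}$ simply by

$$\tau^{(\alpha)}_{\sigma_0,\sigma_1} : T_{\sigma_0}\widehat{\mathcal{M}} \to T_{\sigma_1}\widehat{\mathcal{M}}, \qquad v \mapsto (\ell_\alpha)^{-1}_{*(\sigma_1)}\left((\ell_\alpha)_{*(\sigma_0)}\, v\right),$$

and we find (using (6) without the projection step) that the
$\alpha$-representation of its covariant derivative is

$$\left(\widehat{\nabla}^{(\alpha)}_{\partial_i}\frac{\partial}{\partial\theta^j}\right)^{(\alpha)} = \frac{\partial^2\ell_\alpha}{\partial\theta^i\partial\theta^j}, \tag{8}$$

where $\theta = (\theta^1, \ldots, \theta^{n+1})$ is any coordinate system
for the extended manifold $\widehat{\mathcal{M}}$. Now let
$\{X_1, \ldots, X_{n+1}\}$ be a basis for $\mathcal{A}$. For each
$\sigma \in \widehat{\mathcal{M}}$ we have that
$\sigma^{\frac{1-\alpha}{2}} \in \mathcal{A}$, so that there exist real
numbers $\xi = (\xi^1, \ldots, \xi^{n+1})$ such that

$$\frac{2}{1-\alpha}\,\sigma^{\frac{1-\alpha}{2}} = \xi^1 X_1 + \cdots + \xi^{n+1} X_{n+1}.$$

Then $\xi = (\xi^1, \ldots, \xi^{n+1})$ is a
$\widehat{\nabla}^{(\alpha)}$-affine coordinate system for
$\widehat{\mathcal{M}}$, since (8) gives

$$\left(\widehat{\nabla}^{(\alpha)}_{\partial/\partial\xi^i}\frac{\partial}{\partial\xi^j}\right)^{(\alpha)} = \frac{\partial^2\ell_\alpha}{\partial\xi^i\partial\xi^j} = 0.$$

Therefore, $\widehat{\mathcal{M}}$ is $\widehat{\nabla}^{(\alpha)}$-flat,
even though its submanifold $\mathcal{M}$ is not
$\nabla^{(\alpha)}$-flat. We note in passing that the connection
$\nabla^{(\alpha)}$ on the submanifold $\mathcal{M}$ is a restriction of
the connection $\widehat{\nabla}^{(\alpha)}$, which acts on the larger
manifold $\widehat{\mathcal{M}}$, obtained without the use of any metric
on $\widehat{\mathcal{M}}$, but rather using the canonical projection
existing in $\mathcal{A}$, the target space for the $\alpha$-embedding.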

We finish this section with a couple of comparative remarks. Definition 1
is the verbatim analogue for finite dimensional quantum systems of the
general definition of $\alpha$-connections for infinite dimensional
classical information manifolds and is, consequently, the quantum analogue
of the original definition by Amari and Chentsov as well. Formulae (7)
and (8) are special cases of those obtained by Jencova using an embedding
by a more general monotone function $g$, which includes the
$\alpha$-embeddings (see respectively line 3, page 150 and line 10,
page 149 of her paper). Finally, quantum $\alpha$-connections in the
spirit we present here had been hinted at before by Hasegawa in
[15, equation 35] and in a related paper (equation 16), although in the
less general form of Christoffel symbols, which depend on a metric to be
defined, as opposed to covariant derivatives and parallel transports,
which are therefore more intrinsic. Infinite dimensional quantum
$\alpha$-connections have also been proposed, making heavy use of the
geometry of uniformly convex Banach spaces, of which the definitions given
here are concrete finite dimensional realizations.



3 Duality and the WYD metrics 

We recall some purely geometrical definitions of duality, which apply to
any statistical manifold, classical or quantum: dual affine connections
and dual coordinate systems.

Two connections $\nabla$ and $\nabla^*$ on a Riemannian manifold
$(\mathcal{M}, g)$ are dual with respect to $g$ if and only if

$$X g(Y,Z) = g(\nabla_X Y, Z) + g(Y, \nabla^*_X Z), \tag{9}$$

for any vector fields $X, Y, Z$ on $\mathcal{M}$ [1]. Equivalently, if
$\tau_{\gamma(t)}$ and $\tau^*_{\gamma(t)}$ are the respective parallel
transports along a curve $\{\gamma(t)\}_{0 \le t \le 1}$ on $\mathcal{M}$,
with $\gamma(0) = \rho$, then $\nabla$ and $\nabla^*$ are dual with
respect to $g$ if and only if for all $t \in [0,1]$,

$$g_\rho(Y, Z) = g_{\gamma(t)}\left(\tau_{\gamma(t)} Y, \tau^*_{\gamma(t)} Z\right). \tag{10}$$

Two coordinate systems $\theta = (\theta^i)$ and $\eta = (\eta_i)$ on a
Riemannian manifold $(\mathcal{M}, g)$ are dual with respect to $g$ if and
only if their natural bases for $T_\rho\mathcal{M}$ are biorthogonal at
every point $\rho \in \mathcal{M}$, that is,

$$g\left(\frac{\partial}{\partial\theta^i}, \frac{\partial}{\partial\eta_j}\right) = \delta_i^j.$$

Equivalently, $\theta = (\theta^i)$ and $\eta = (\eta_i)$ are dual with
respect to $g$ if and only if

$$g_{ij} = \frac{\partial\eta_i}{\partial\theta^j} \qquad \text{and} \qquad g^{ij} = \frac{\partial\theta^i}{\partial\eta_j}$$

at every point $\rho \in \mathcal{M}$, where, as usual,
$g^{ij} = (g_{ij})^{-1}$.

The next two theorems establish the role of potential functions as well
as the relation between dual connections and dual coordinate systems for
the case of flat manifolds. In the sense used in this paper, a connection
$\nabla$ on a manifold $\mathcal{M}$ is said to be flat if $\mathcal{M}$
admits a global $\nabla$-affine coordinate system. This is equivalent to
its curvature and torsion both being zero.

Theorem 3.1 (Amari, 1985) When a Riemannian manifold $(\mathcal{M}, g)$
has a pair of dual coordinate systems $(\theta, \eta)$, there exist
potential functions $\Psi(\theta)$ and $\Phi(\eta)$ such that

$$g_{ij}(\theta) = \frac{\partial^2\Psi(\theta)}{\partial\theta^i\partial\theta^j} \qquad \text{and} \qquad g^{ij}(\eta) = \frac{\partial^2\Phi(\eta)}{\partial\eta_i\partial\eta_j}.$$

Conversely, when either potential function $\Psi$ or $\Phi$ exists from
which the metric is derived by differentiating it twice, there exists a
pair of dual coordinate systems. The dual coordinate systems and the
potential functions are related by the following Legendre transforms

$$\theta^i = \frac{\partial\Phi(\eta)}{\partial\eta_i}, \qquad \eta_i = \frac{\partial\Psi(\theta)}{\partial\theta^i}$$

and

$$\Psi(\theta) + \Phi(\eta) - \theta^i\eta_i = 0.$$

Theorem 3.2 (Amari, 1985) Suppose that $\nabla$ and $\nabla^*$ are two
flat connections on a manifold $\mathcal{M}$. If they are dual with
respect to a Riemannian metric $g$ on $\mathcal{M}$, then there exists a
pair $(\theta, \eta)$ of dual coordinate systems such that $\theta$ is
$\nabla$-affine and $\eta$ is $\nabla^*$-affine.

Let us now consider the definition of a Riemannian metric for our
manifold $\mathcal{M}$ of density matrices. Using the
$\alpha$-representation to obtain a concrete realization of tangent
vectors on $\mathcal{M}$ in terms of operators in $\mathcal{A}$, a
Riemannian metric on $\mathcal{M}$ is deemed to be provided by the smooth
assignment of an inner product $\langle\cdot,\cdot\rangle_\rho$ in
$\mathcal{A} \subset B(\mathcal{H}^N)$ for each point
$\rho \in \mathcal{M}$.

For a fixed $\alpha \in (-1, 1)$, the WYD (Wigner-Yanase-Dyson) metric on
$\mathcal{M}$ is given by

$$g^{(\alpha)}_\rho(A,B) := \mathrm{Tr}\left(A^{(\alpha)} B^{(-\alpha)}\right), \qquad A, B \in T_\rho\mathcal{M}. \tag{11}$$



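The symmetry properties of this definition are more apparent if one
expresses it in a coordinate system $(\theta^1, \ldots, \theta^n)$ for
$\mathcal{M}$. By virtue of the decomposition lemma 2.1, we have that

$$g^{(\alpha)}_{ij}(\theta) := g^{(\alpha)}_\rho\left(\frac{\partial}{\partial\theta^i}, \frac{\partial}{\partial\theta^j}\right) = \mathrm{Tr}\left(\rho\,\frac{\partial^c\log\rho}{\partial\theta^i}\,\frac{\partial^c\log\rho}{\partial\theta^j}\right) + \frac{4}{1-\alpha^2}\,\mathrm{Tr}\left(\left[\rho^{\frac{1-\alpha}{2}}, \Delta_i\right]\left[\rho^{\frac{1+\alpha}{2}}, \Delta_j\right]\right). \tag{12}$$

It is then clear that $g^{(\alpha)}_{ij} = g^{(-\alpha)}_{ij}$. Observe
also that for the extreme cases $\alpha \to \pm 1$, formula (11) leads to
the familiar BKM (Bogoliubov-Kubo-Mori) metric

$$g^{\mathrm{BKM}}_\rho(A,B) := \mathrm{Tr}\left(A^{(1)} B^{(-1)}\right), \tag{13}$$

where $A^{(\pm 1)}, B^{(\pm 1)}$ are the $\pm 1$-representations of the
tangent vectors $A, B \in T_\rho\mathcal{M}$. In coordinates, the BKM
metric assumes the form

$$g^{\mathrm{BKM}}_{ij}(\theta) := g^{\mathrm{BKM}}_\rho\left(\frac{\partial}{\partial\theta^i}, \frac{\partial}{\partial\theta^j}\right) = \mathrm{Tr}\left(\frac{\partial\log\rho}{\partial\theta^i}\,\frac{\partial\rho}{\partial\theta^j}\right) = \mathrm{Tr}\left(\rho^{-1}\,\frac{\partial^c\rho}{\partial\theta^i}\,\frac{\partial^c\rho}{\partial\theta^j}\right) + \mathrm{Tr}\left([\log\rho, \Delta_i][\rho, \Delta_j]\right). \tag{14}$$

It follows directly from the definitions, as has been observed in a
number of papers [16, 21], that the $\pm\alpha$-connections are dual with
respect to the metric $g^{(\alpha)}$ for each fixed value of
$\alpha \in (-1, 1)$ (just as the $\pm 1$-connections are dual with
respect to the BKM metric). Our purpose is to discover what other metrics
have the same property.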



As suggested by the statement in theorem 3.2, most of the ingredients of
Amari's theory, such as statistical divergences and the projection
theorems [pp. 84-93], can only be a priori defined for flat manifolds.
Only at a later stage does one consider what happens when they are
applied to curved submanifolds of flat manifolds. Following this trend,
we from now on confine our attention to those metrics on $\mathcal{M}$
which are obtained as restrictions of metrics on the extended manifold
$\widehat{\mathcal{M}}$, which is $\widehat{\nabla}^{(\pm\alpha)}$-flat,
and treat the latter as our primary objects.

Observe first that the WYD metric extends quite naturally to
$\widehat{\mathcal{M}}$, simply using the $\pm\alpha$-representations of
tangent vectors $A, B$ (that is, the representation induced by the
$\pm\alpha$-embedding of $\widehat{\mathcal{M}}$ into $\mathcal{A}$):

$$\widehat{g}^{(\alpha)}_\sigma(A,B) := \mathrm{Tr}\left(A^{(\alpha)} B^{(-\alpha)}\right), \qquad A, B \in T_\sigma\widehat{\mathcal{M}}. \tag{15}$$

It is also obvious that $\widehat{g}^{(\alpha)}$ has the same symmetry
and duality properties as $g^{(\alpha)}$. We now show how
$\widehat{g}^{(\alpha)}$ can be obtained from a potential function on
$\widehat{\mathcal{M}}$.

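Lemma 3.3 If $(\theta^1, \ldots, \theta^{n+1})$ is a
$\widehat{\nabla}^{(\alpha)}$-affine coordinate system for the extended
manifold $\widehat{\mathcal{M}}$, then the function

$$\Psi_\alpha(\theta) = \frac{2}{1+\alpha}\,\mathrm{Tr}\,\sigma(\theta), \qquad \sigma(\theta) \in \widehat{\mathcal{M}}, \tag{16}$$

satisfies

$$\widehat{g}^{(\alpha)}_{ij}(\theta) = \frac{\partial^2\Psi_\alpha(\theta)}{\partial\theta^i\partial\theta^j}. \tag{17}$$

Moreover,

$$\eta_i = \frac{\partial\Psi_\alpha(\theta)}{\partial\theta^i} \tag{18}$$

is a $\widehat{\nabla}^{(-\alpha)}$-affine coordinate system for
$\widehat{\mathcal{M}}$.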



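Proof: Since $\theta$ is $\widehat{\nabla}^{(\alpha)}$-affine, there exist
linearly independent operators $\{X_1, \ldots, X_{n+1}\}$ such that

$$\ell_\alpha(\sigma) = \frac{2}{1-\alpha}\,\sigma^{\frac{1-\alpha}{2}} = \theta^1 X_1 + \cdots + \theta^{n+1} X_{n+1}. \tag{19}$$

Since the point $\sigma \in \widehat{\mathcal{M}}$ is fixed in the course
of this proof, we omit it from the notation and just write $\ell_\alpha$
and $\ell_{-\alpha}$ for $\ell_\alpha(\sigma)$ and
$\ell_{-\alpha}(\sigma)$, respectively. From lemma 2.1 we obtain that

$$X_i = \frac{\partial\ell_\alpha}{\partial\theta^i} = \frac{\partial^c\ell_\alpha}{\partial\theta^i} + [\ell_\alpha, \Delta_i], \tag{20}$$

that is

$$\frac{\partial^c\ell_\alpha}{\partial\theta^i} = X_i + [\Delta_i, \ell_\alpha]. \tag{21}$$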
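Also, since

$$\ell_{-\alpha} = \frac{2}{1+\alpha}\,\sigma^{\frac{1+\alpha}{2}},$$

we have that

$$\frac{\partial^c\ell_{-\alpha}}{\partial\theta^i} = \sigma^{\alpha}\,\frac{\partial^c\ell_\alpha}{\partial\theta^i}.$$

So using lemma 2.1 again we get

$$\frac{\partial\ell_{-\alpha}}{\partial\theta^i} = \sigma^{\alpha}\,\frac{\partial^c\ell_\alpha}{\partial\theta^i} + [\ell_{-\alpha}, \Delta_i]. \tag{22}$$

Now observe that $\sigma = \frac{1-\alpha^2}{4}\,\ell_\alpha\ell_{-\alpha}$
and that the second derivatives of $\ell_\alpha$ vanish by (19), so that

$$\frac{\partial^2\Psi_\alpha}{\partial\theta^i\partial\theta^j} = \frac{2}{1+\alpha}\,\mathrm{Tr}\,\frac{\partial^2\sigma}{\partial\theta^i\partial\theta^j} = \frac{1-\alpha}{2}\,\mathrm{Tr}\,\frac{\partial^2(\ell_\alpha\ell_{-\alpha})}{\partial\theta^i\partial\theta^j} = \frac{1-\alpha}{2}\,\mathrm{Tr}\left(X_i\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j} + X_j\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i} + \ell_\alpha\,\frac{\partial^2\ell_{-\alpha}}{\partial\theta^i\partial\theta^j}\right). \tag{23}$$

Let us now evaluate each of the terms in the last expression separately.
For the first one we have

$$\mathrm{Tr}\left(X_i\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right) = \mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right) = \mathrm{Tr}\left(\frac{\partial^c\ell_\alpha}{\partial\theta^i}\,\frac{\partial^c\ell_{-\alpha}}{\partial\theta^j}\right) + \mathrm{Tr}\left([\ell_\alpha, \Delta_i][\ell_{-\alpha}, \Delta_j]\right), \tag{24}$$

where the cross terms vanish because, for instance,
$\mathrm{Tr}\left([\Delta_i, \ell_\alpha]\,\frac{\partial^c\ell_{-\alpha}}{\partial\theta^j}\right) = 0$,
since $\frac{\partial^c\ell_{-\alpha}}{\partial\theta^j}$ commutes with
$\ell_\alpha$.

Exchanging the roles of the indices $i$ and $j$ in (24) we find that the
second term in (23) gives

$$\mathrm{Tr}\left(X_j\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i}\right) = \mathrm{Tr}\left(\frac{\partial^c\ell_\alpha}{\partial\theta^j}\,\frac{\partial^c\ell_{-\alpha}}{\partial\theta^i}\right) + \mathrm{Tr}\left([\ell_\alpha, \Delta_j][\ell_{-\alpha}, \Delta_i]\right). \tag{25}$$

But

$$\frac{\partial^c\ell_\alpha}{\partial\theta^j}\,\frac{\partial^c\ell_{-\alpha}}{\partial\theta^i} = \sigma\,\frac{\partial^c\log\sigma}{\partial\theta^j}\,\frac{\partial^c\log\sigma}{\partial\theta^i} = \frac{\partial^c\ell_\alpha}{\partial\theta^i}\,\frac{\partial^c\ell_{-\alpha}}{\partial\theta^j},$$

and, by the cyclicity of the trace together with
$[\ell_\alpha, \ell_{-\alpha}] = 0$,
$\mathrm{Tr}\left([\ell_\alpha, \Delta_j][\ell_{-\alpha}, \Delta_i]\right) = \mathrm{Tr}\left([\ell_\alpha, \Delta_i][\ell_{-\alpha}, \Delta_j]\right)$.
Therefore

$$\mathrm{Tr}\left(X_j\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i}\right) = \mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right). \tag{26}$$

As for the third term in (23), lemma 2.1 and the identity
$\ell_\alpha\,\frac{\partial^c\ell_{-\alpha}}{\partial\theta^i} = \frac{1+\alpha}{1-\alpha}\,\ell_{-\alpha}\,\frac{\partial^c\ell_\alpha}{\partial\theta^i}$
give

$$\mathrm{Tr}\left(\ell_\alpha\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i}\right) = \frac{1+\alpha}{1-\alpha}\,\mathrm{Tr}\left(X_i\,\ell_{-\alpha}\right).$$

Differentiating this relation with respect to $\theta^j$ and using (26),
we obtain

$$\mathrm{Tr}\left(\ell_\alpha\,\frac{\partial^2\ell_{-\alpha}}{\partial\theta^i\partial\theta^j}\right) = \frac{1+\alpha}{1-\alpha}\,\mathrm{Tr}\left(X_i\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right) - \mathrm{Tr}\left(X_j\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i}\right) = \frac{2\alpha}{1-\alpha}\,\mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right). \tag{27}$$

Collecting together (24), (26) and (27) we conclude that

$$\frac{\partial^2\Psi_\alpha}{\partial\theta^i\partial\theta^j} = \frac{1-\alpha}{2}\left(2 + \frac{2\alpha}{1-\alpha}\right)\mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right) = \mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right). \tag{28}$$

On the other hand, by the definition (15), the WYD metric in this
$\widehat{\nabla}^{(\alpha)}$-affine coordinate system assumes the form

$$\widehat{g}^{(\alpha)}_{ij}(\theta) = \mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\frac{\partial\ell_{-\alpha}}{\partial\theta^j}\right), \tag{29}$$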

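which proves the first assertion of the lemma. For the second part of the
lemma, we have seen in the previous section that there exists a
$\widehat{\nabla}^{(-\alpha)}$-affine coordinate system
$\xi = (\xi_1, \ldots, \xi_{n+1})$ in terms of which we can write

$$\ell_{-\alpha} = \xi_1 Y^1 + \cdots + \xi_{n+1} Y^{n+1} \tag{30}$$

for some other set of linearly independent operators
$\{Y^1, \ldots, Y^{n+1}\}$. Now following the same reasoning that led to
(23) we obtain that

$$\begin{aligned}
\frac{\partial\Psi_\alpha(\theta)}{\partial\theta^i}
&= \frac{1-\alpha}{2}\,\mathrm{Tr}\left(\frac{\partial\ell_\alpha}{\partial\theta^i}\,\ell_{-\alpha} + \ell_\alpha\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i}\right) \\
&= \frac{1-\alpha}{2}\,\mathrm{Tr}\left[X_i\,\ell_{-\alpha} + \ell_\alpha\,\frac{\partial\ell_{-\alpha}}{\partial\theta^i}\right] \\
&= \frac{1-\alpha}{2}\left(1 + \frac{1+\alpha}{1-\alpha}\right)\mathrm{Tr}\left(X_i\,\ell_{-\alpha}\right) \\
&= \mathrm{Tr}\left[X_i\left(\xi_1 Y^1 + \cdots + \xi_{n+1} Y^{n+1}\right)\right] \\
&= \xi_1\,\mathrm{Tr}\left(X_i Y^1\right) + \cdots + \xi_{n+1}\,\mathrm{Tr}\left(X_i Y^{n+1}\right) \\
&= \sum_{j=1}^{n+1} \mathrm{Tr}\left(X_i Y^j\right)\xi_j.
\end{aligned} \tag{31}$$

This means that the coordinate system $(\eta)$ is affinely related to
$(\xi)$ and therefore it is itself $\widehat{\nabla}^{(-\alpha)}$-affine.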
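The first assertion of Lemma 3.3 lends itself to a numerical sanity check. The sketch below is an illustration under stated assumptions rather than part of the proof: it works with $2 \times 2$ matrices, takes the identity and the Pauli matrices as the basis $X_1, \ldots, X_4$ of $\mathcal{A}$, fixes $\alpha = 0.4$ and an arbitrary point $\theta$, and approximates all derivatives by central finite differences. The function names and step sizes are hypothetical choices made here.

```python
import numpy as np

# Basis X_1..X_4 of the self-adjoint 2x2 matrices (identity plus Pauli matrices).
BASIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def herm_power(S, p):
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.conj().T

def ell(S, alpha):
    """alpha-embedding of a positive definite matrix."""
    return 2 / (1 - alpha) * herm_power(S, (1 - alpha) / 2)

def sigma_of_theta(theta, alpha):
    """Invert ell_alpha(sigma) = sum_i theta^i X_i (alpha-affine coordinates)."""
    ell_alpha = sum(t * X for t, X in zip(theta, BASIS))
    return herm_power((1 - alpha) / 2 * ell_alpha, 2 / (1 - alpha))

def potential(theta, alpha):
    """Psi_alpha(theta) = 2/(1+alpha) * Tr sigma(theta), as in (16)."""
    return 2 / (1 + alpha) * np.trace(sigma_of_theta(theta, alpha)).real

def hessian(f, theta, h=1e-3):
    n = len(theta)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            out[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                         - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return out

def wyd_metric(theta, alpha, h=1e-5):
    """g_ij = Tr(d_i ell_alpha d_j ell_{-alpha}); d_i ell_alpha = X_i in these coordinates."""
    n = len(theta)
    g = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        d_ell_minus = (ell(sigma_of_theta(theta + e, alpha), -alpha)
                       - ell(sigma_of_theta(theta - e, alpha), -alpha)) / (2 * h)
        for i in range(n):
            g[i, j] = np.trace(BASIS[i] @ d_ell_minus).real
    return g

alpha = 0.4
theta = np.array([2.0, 0.3, -0.2, 0.4])      # chosen so that sum_i theta^i X_i is positive
psi_hessian = hessian(lambda t: potential(t, alpha), theta)
g = wyd_metric(theta, alpha)
```

Up to finite-difference error, `psi_hessian` and `g` should coincide, which is the content of (17), and `g` should be symmetric.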



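We end this section with the next theorem, which is the extension to
general $\alpha$-connections of the result proved in [13] for the case
$\alpha = \pm 1$.

Theorem 3.4 For a fixed value of $\alpha \in (-1, 1)$, suppose that the
connections $\widehat{\nabla}^{(\alpha)}$ and
$\widehat{\nabla}^{(-\alpha)}$ are dual with respect to a Riemannian
metric $g$ on $\widehat{\mathcal{M}}$. Then there exists a constant
(independent of $\sigma$) $(n+1) \times (n+1)$ matrix $M$, such that

$$(g_\sigma)_{ij} = \sum_{k=1}^{n+1} M_i^k\,\left(\widehat{g}^{(\alpha)}_\sigma\right)_{kj}$$

in some $\alpha$-affine coordinate system.

Proof: Since the two connections are flat on the extended manifold
$\widehat{\mathcal{M}}$, theorem 3.2 tells us that there exist dual
coordinate systems $(\theta, \eta)$ such that $\theta$ is
$\widehat{\nabla}^{(\alpha)}$-affine and $\eta$ is
$\widehat{\nabla}^{(-\alpha)}$-affine. Using lemma 3.3, we know that the
function $\Psi_\alpha(\theta) = \frac{2}{1+\alpha}\,\mathrm{Tr}\,\sigma(\theta)$
satisfies

$$\widehat{g}^{(\alpha)}_{ij}(\theta) = \frac{\partial^2\Psi_\alpha(\theta)}{\partial\theta^i\partial\theta^j} \tag{32}$$

and also that

$$\tilde{\eta}_i = \frac{\partial\Psi_\alpha(\theta)}{\partial\theta^i} \tag{33}$$

is another $\widehat{\nabla}^{(-\alpha)}$-affine coordinate system for
$\widehat{\mathcal{M}}$. Therefore, the coordinate systems $(\eta)$ and
$(\tilde{\eta})$ are related by an affine transformation, so there must
exist a matrix $M$ and numbers $(a_1, \ldots, a_{n+1})$ such that

$$\eta_i = \sum_{k=1}^{n+1} M_i^k\,\tilde{\eta}_k + a_i. \tag{34}$$

But from theorem 3.1, there exists a potential function $\Psi(\theta)$
such that

$$g_{ij}(\theta) = \frac{\partial^2\Psi(\theta)}{\partial\theta^i\partial\theta^j}$$

and

$$\eta_i = \frac{\partial\Psi(\theta)}{\partial\theta^i}.$$

Equation (34) then gives

$$\frac{\partial\Psi(\theta)}{\partial\theta^i} = \sum_{k=1}^{n+1} M_i^k\,\frac{\partial\Psi_\alpha(\theta)}{\partial\theta^k} + a_i$$

and differentiating this equation with respect to $\theta^j$ leads to

$$g_{ij}(\theta) = \sum_{k=1}^{n+1} M_i^k\,\frac{\partial^2\Psi_\alpha(\theta)}{\partial\theta^k\partial\theta^j} = \sum_{k=1}^{n+1} M_i^k\,\widehat{g}^{(\alpha)}_{kj}(\theta). \tag{35}$$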
4 The condition of monotonicity



We have seen in the previous section that requiring duality between the
$\widehat{\nabla}^{(\alpha)}$ and $\widehat{\nabla}^{(-\alpha)}$
connections reduces the set of possible Riemannian metrics on
$\widehat{\mathcal{M}}$ to matrix multiples of the WYD metric. Following
[13], we now investigate the effect of imposing a monotonicity property
on this set.

Recall that the $-1$-representation is the limiting case $\alpha = -1$ of
the $\alpha$-representations defined in section 2.1. If we use it to
define a Riemannian metric $g$ on $\widehat{\mathcal{M}}$ by means of the
inner product $\langle\cdot,\cdot\rangle_\rho$ in
$\mathcal{A} \subset B(\mathcal{H}^N)$, then we say that $g$ is monotone
if and only if

$$\left\langle S\left(A^{(-1)}\right), S\left(A^{(-1)}\right)\right\rangle_{S(\rho)} \le \left\langle A^{(-1)}, A^{(-1)}\right\rangle_\rho, \tag{36}$$

for every $\rho \in \mathcal{M}$, $A \in T_\rho\mathcal{M}$, and every
completely positive, trace preserving map
$S : \mathcal{A} \to \mathcal{A}$.
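The inequality (36) can be illustrated numerically for the WYD metric, which the introduction states is monotone for these values of $\alpha$. The sketch below makes several assumptions that are not in the paper: it uses a depolarizing map $X \mapsto (1-\lambda)X + \lambda\,\mathrm{Tr}(X)\,I/N$ as one example of a completely positive, trace preserving map, it computes the $\pm\alpha$-representations of a tangent vector through first divided differences in the eigenbasis of $\rho$, and the names `alpha_rep`, `wyd` and `depolarize` are hypothetical.

```python
import numpy as np

def alpha_rep(rho, A, alpha):
    """Directional derivative of ell_alpha at rho along A (alpha-representation of A)."""
    e = (1.0 - alpha) / 2.0
    p, U = np.linalg.eigh(rho)
    a = U.conj().T @ A @ U                       # A in the eigenbasis of rho
    f = 2.0 / (1.0 - alpha) * p**e               # ell_alpha evaluated on the spectrum
    gap = p[:, None] - p[None, :]
    close = np.abs(gap) < 1e-12
    # First divided differences of ell_alpha; its derivative x**(e-1) where eigenvalues coincide.
    dd = np.where(close, (p ** (e - 1.0))[:, None],
                  (f[:, None] - f[None, :]) / np.where(close, 1.0, gap))
    return U @ (dd * a) @ U.conj().T

def wyd(rho, A, B, alpha):
    """WYD metric Tr(A^(alpha) B^(-alpha)), as in (11)."""
    return np.trace(alpha_rep(rho, A, alpha) @ alpha_rep(rho, B, -alpha)).real

def depolarize(X, lam):
    """Depolarizing map, used here as an example of a stochastic map."""
    n = X.shape[0]
    return (1.0 - lam) * X + lam * np.trace(X) / n * np.eye(n)

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho = (Q * np.array([0.5, 0.3, 0.2])) @ Q.conj().T
H = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = H + H.conj().T
A -= np.trace(A).real / 3 * np.eye(3)            # traceless tangent vector at rho

lam = 0.3
results = []
for alpha in (-0.8, 0.0, 0.5):
    before = wyd(rho, A, A, alpha)
    after = wyd(depolarize(rho, lam), depolarize(A, lam), depolarize(A, lam), alpha)
    results.append((alpha, before, after))
```

For each value of $\alpha$ in this sketch, the squared length `after` computed at the image state should not exceed `before`, as (36) requires.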

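For any metric $g$ on $T\widehat{\mathcal{M}}$, define the positive
(super) operator $K^g_\sigma$ on $\mathcal{A}$ by

$$g_\sigma(A,B) = \left\langle A^{(-1)}, K^g_\sigma\left(B^{(-1)}\right)\right\rangle_{HS} = \mathrm{Tr}\left(A^{(-1)}\,K^g_\sigma\left(B^{(-1)}\right)\right). \tag{37}$$

Note that our $K$ is denoted $K^{-1}$ by Petz in [25]. Define also the
(super) operators $L_\sigma X := \sigma X$ and $R_\sigma X := X\sigma$,
for $X \in \mathcal{A}$, which are also positive. The aforementioned
characterization of monotone metrics obtained by Petz is the content of
the following theorem.

Theorem 4.1 (Petz 96) A Riemannian metric $g$ on $\widehat{\mathcal{M}}$
is monotone if and only if

$$K^g_\sigma = \left(R_\sigma^{1/2}\,f\left(L_\sigma R_\sigma^{-1}\right)R_\sigma^{1/2}\right)^{-1}, \tag{38}$$

where $K^g_\sigma$ is defined in (37) and
$f : \mathbb{R}^+ \to \mathbb{R}^+$ is an operator monotone function
satisfying $f(t) = t f(t^{-1})$.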

In particular, the WYD metric is monotone and its corresponding operator
monotone function is

$$f_p(x) = \frac{p(1-p)(x-1)^2}{\left(x^p - 1\right)\left(x^{1-p} - 1\right)}$$

for $p = \frac{1+\alpha}{2}$.
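A few elementary properties of this function can be checked numerically. The sketch below is an illustration only: the sample points, the value $\alpha = 0.4$ and the tolerances are arbitrary choices. It checks the symmetry $f(t) = t f(t^{-1})$ required in Theorem 4.1, the value of $f_p$ near $x = 1$, and the fact that for $p$ close to $0$ the function approaches $(x-1)/\log x$, the function usually associated with the BKM metric. The formula is symmetric under $p \leftrightarrow 1-p$, so the same checks hold with $p = \frac{1-\alpha}{2}$.

```python
import numpy as np

def f_wyd(x, p):
    """Operator monotone function associated with the WYD metric (for x != 1)."""
    return p * (1 - p) * (x - 1) ** 2 / ((x**p - 1) * (x ** (1 - p) - 1))

alpha = 0.4
p = (1 + alpha) / 2
x = np.array([0.2, 0.5, 2.0, 7.0])

# Symmetry condition f(t) = t f(1/t) from Theorem 4.1.
symmetry_gap = np.max(np.abs(f_wyd(x, p) - x * f_wyd(1 / x, p)))

# Behaviour close to x = 1, where the formula itself is a 0/0 expression.
value_near_one = f_wyd(1 + 1e-4, p)

# For p close to 0 the function approaches (x - 1) / log(x).
bkm_gap = np.max(np.abs(f_wyd(x, 1e-6) - (x - 1) / np.log(x)))

# Invariance under p <-> 1 - p.
swap_gap = np.max(np.abs(f_wyd(x, p) - f_wyd(x, 1 - p)))
```

In this sketch `symmetry_gap`, `bkm_gap` and `swap_gap` should be small and `value_near_one` should be close to 1.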

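Combining this characterization with our theorem 3.4, we obtain the
following improved uniqueness result.

Theorem 4.2 If the connections $\widehat{\nabla}^{(\alpha)}$ and
$\widehat{\nabla}^{(-\alpha)}$ are dual with respect to a monotone
Riemannian metric $g$ on $\widehat{\mathcal{M}}$, then $g$ is a scalar
multiple of the WYD metric.

Proof: Let $\theta = (\theta^1, \ldots, \theta^{n+1})$ be the
$\widehat{\nabla}^{(\alpha)}$-affine coordinate system of theorem 3.4.
Given $\sigma \in \widehat{\mathcal{M}}$, we have that
$T_\sigma\widehat{\mathcal{M}} \simeq \mathcal{A}$. In particular,
$\left\{\frac{\partial\sigma}{\partial\theta^1}, \ldots, \frac{\partial\sigma}{\partial\theta^{n+1}}\right\}$
is the basis for $\mathcal{A}$ obtained as the $-1$-representation of
$\left\{\frac{\partial}{\partial\theta^1}, \ldots, \frac{\partial}{\partial\theta^{n+1}}\right\}$.
Now let $K^g_\sigma$ and $K^{(\alpha)}_\sigma$ be the kernels of $g$ and
$\widehat{g}^{(\alpha)}$, respectively. Then it follows from theorem 3.4
that

$$\left\langle \frac{\partial\sigma}{\partial\theta^i}, K^g_\sigma\left(\frac{\partial\sigma}{\partial\theta^j}\right)\right\rangle = g_\sigma\left(\frac{\partial}{\partial\theta^i}, \frac{\partial}{\partial\theta^j}\right) = \sum_{k=1}^{n+1} M_i^k\,\widehat{g}^{(\alpha)}_\sigma\left(\frac{\partial}{\partial\theta^k}, \frac{\partial}{\partial\theta^j}\right) = \sum_{k=1}^{n+1} M_i^k\left\langle \frac{\partial\sigma}{\partial\theta^k}, K^{(\alpha)}_\sigma\left(\frac{\partial\sigma}{\partial\theta^j}\right)\right\rangle. \tag{39}$$

Thus, as operators on $\mathcal{A}$, the kernels $K^g_\sigma$ and
$K^{(\alpha)}_\sigma$ are related by

$$K^g_\sigma = M K^{(\alpha)}_\sigma. \tag{40}$$

Therefore, if $f^g$ and $f^{(\alpha)}$ are the operator monotone functions
corresponding respectively to $g$ and $\widehat{g}^{(\alpha)}$ from
theorem 4.1, we have

$$\begin{aligned}
\left(R_\sigma^{1/2}\,f^g\left(L_\sigma R_\sigma^{-1}\right)R_\sigma^{1/2}\right)^{-1} &= M\left(R_\sigma^{1/2}\,f^{(\alpha)}\left(L_\sigma R_\sigma^{-1}\right)R_\sigma^{1/2}\right)^{-1}, \\
\left(R_\sigma^{1/2}\,f^g\left(L_\sigma R_\sigma^{-1}\right)R_\sigma^{1/2}\right)M &= R_\sigma^{1/2}\,f^{(\alpha)}\left(L_\sigma R_\sigma^{-1}\right)R_\sigma^{1/2}, \\
M &= f^g\left(L_\sigma R_\sigma^{-1}\right)^{-1} f^{(\alpha)}\left(L_\sigma R_\sigma^{-1}\right),
\end{aligned}$$

as everything commutes. Thus, the operator $M$ is given as a function of
the operator $L_\sigma R_\sigma^{-1}$, but it is itself independent of the
point $\sigma$, so we conclude that it must be a scalar multiple of the
identity operator.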

5 Discussion 

With the result of this paper, we have completed the programme initiated
in [13] of characterizing the BKM and the WYD metrics in terms of the
combined requirements of monotonicity and duality. The monotonicity
condition has an appealing motivation coming from estimation theory. If
we interpret the geodesic distance between two density matrices as a
measure of their statistical distinguishability, then (36) tells us that
they will become less distinguishable if we introduce randomness into the
system under consideration. In other words, their distance decreases
under coarse-graining.

As it is, estimation theory is more basic than physics itself, since it
does not assume any particular underlying physical process, being just a
tool to help analyze statistical data. Nevertheless, the interpretation
above carries over to statistical mechanical systems as well, where
stochastic (i.e. completely positive, trace-preserving) maps appear as a
mathematical implementation of the time evolution of a system whose
states are described by density matrices [29]. In this case, monotonicity
means that the distance between different states decreases under the same
time evolution. If it decreases asymptotically to zero for any two points
in a certain set of 'initial' states, then we are in the presence of a
fixed point for the dynamics, or in other words, an equilibrium state.
From all this, it seems that imposing a monotonicity condition on the
possible Riemannian metrics on a statistical manifold is not at all an
artificial technicality.

Our motivation behind Amari's duality is less general and ultimately 
rests upon quantum statistical mechanics alone |31, pO]. Recall that the von 




Neumann entropy for a state p G A4 is defined as 

S{p):=-Triplogp) (41) 

and that the relative (Kullback-Leibler) entropy of the state p given the 
state a is 

S{p\a)=Tv[p{logp-loga)] (42) 

Now let us choose a set of $m < n$ observables $Y_1, \dots, Y_m$ such that the set 
$\{\mathbf{1}, Y_1, \dots, Y_n\}$ is a basis for $\mathcal{A}$. Among all possible observables in 
$\mathcal{A}$, these represent the slow variables of the theory, that is, those whose 
means we can measure at any given time. Then it is an easy exercise, using the 
Lagrange multiplier technique, to show that the states which maximize the 
von Neumann entropy subject to keeping the means of all $\{Y_i\}$, $i = 1, \dots, m$, 
constant are the Gibbs states of the form 

$$\rho = \exp\left(\theta^1 Y_1 + \cdots + \theta^m Y_m - \psi(\theta)\mathbf{1}\right), \tag{43}$$

where $\psi(\theta)$ is determined by the normalization condition $\mathrm{Tr}\,\rho = 1$. For 
example, if $Y_1 = H$ is the energy operator, then we obtain the so-called 
canonical ensemble, whereas if we have $Y_1 = H$, $Y_2 = N$, where $N$ is the 
number of particles, we get the grand canonical ensemble. We immediately 
recognize these states as constituting a $\nabla^{(1)}$-flat, $m$-dimensional 
submanifold $\mathcal{S}_m \subset \mathcal{S}$, which is determined by our choice of $Y_1, \dots, Y_m$, that is, by 
our choice of the level of description adopted. 
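
The Lagrange multiplier argument mentioned above can be sketched as follows. 
This is only an outline under stated assumptions: the symbols $\eta_i$ (the 
prescribed means) and $\lambda$ (the multiplier for the normalization constraint) 
are introduced here for illustration and do not appear in the original text, 
and $\psi(\theta)$ denotes the normalizing function in (43). 

```latex
% Sketch: maximize S(rho) = -Tr(rho log rho) subject to
%   Tr(rho) = 1  and  Tr(rho Y_i) = eta_i,  i = 1, ..., m.
% eta_i and lambda are illustrative symbols, not taken from the paper.
\begin{align*}
\mathcal{L}(\rho) &= -\mathrm{Tr}(\rho \log \rho)
   + \sum_{i=1}^{m} \theta^i \left( \mathrm{Tr}(\rho Y_i) - \eta_i \right)
   - \lambda \left( \mathrm{Tr}\,\rho - 1 \right), \\
% uses  \delta\,\mathrm{Tr}(\rho\log\rho) = \mathrm{Tr}[\delta\rho\,(\log\rho + \mathbf{1})]
\delta \mathcal{L} &= \mathrm{Tr}\left[ \delta\rho \left( -\log \rho - \mathbf{1}
   + \sum_{i=1}^{m} \theta^i Y_i - \lambda \mathbf{1} \right) \right] = 0
   \quad \text{for all Hermitian } \delta\rho, \\
\log \rho &= \sum_{i=1}^{m} \theta^i Y_i - (1 + \lambda)\mathbf{1}, \\
\psi(\theta) &:= 1 + \lambda
   = \log \mathrm{Tr} \exp\left( \sum_{i=1}^{m} \theta^i Y_i \right)
   \quad \text{(from } \mathrm{Tr}\,\rho = 1 \text{)}.
\end{align*}
```

Since the von Neumann entropy is strictly concave, this stationary point is 
the maximum, and the multipliers $\theta^i$ play the role of the canonical 
coordinates in (43). 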

Inasmuch as entropy is negative information, the principle of maximum 
entropy, advocated in information theory and statistical physics by Jaynes 
[18, tells us that, if the only information available about the system 
under consideration is the means of the random variables $Y_1, \dots, Y_m$, then 
we should take as the state of the system the element in $\mathcal{S}_m$ with these 
means. The replacement of the true state $\rho \in \mathcal{S}$ by the one in $\mathcal{S}_m$ with the 
same means for $Y_1, \dots, Y_m$ is a reflection of our ignorance of what really 
goes on with the system. It is the least biased choice of state given the 
information available. 

The point of view in statistical dynamics [29] is somewhat different, in 
the sense that it regards the same replacement as part of the true dynamics of 
the system. For instance, the heat transfer in a local region of a fluid happens 
10^ times faster than most chemical reactions [§, so we can choose to regard 
the concentrations of the chemicals reacting as the slow variables while all 
other observables are thermalized (maximum entropy) along each time step 
in the dynamics. The skill of the scientist using statistical dynamics thus 
resides in correctly identifying which are the slow variables of the problem 
at hand and then following the time evolution of the system, which involves, 
apart from a stochastic dynamics particular to each problem, successive 
projections onto $\mathcal{S}_m$. 

Information geometry provides a mathematical meaning for this projection 
[m, 31]. It is well known that the relative entropy (42) is the statistical 
divergence associated with the dualistic triple $(g^{B}, \nabla^{(1)}, \nabla^{(-1)})$ [24]. It then 
follows from the general theory |^] that, given an arbitrary point $\rho \in \mathcal{S}$, the 
point in $\mathcal{S}_m$ (which is $\nabla^{(1)}$-flat) that minimizes $S(\rho|\sigma)$ is obtained uniquely 
by following a $(-1)$-geodesic from $\rho$ that intersects $\mathcal{S}_m$ orthogonally with 
respect to the BKM metric. This is equivalent to the projection described 
above (maximum entropy subject to constant means) precisely because a 
path preserving the mean parameters (or mixture coordinates) is a 
$(-1)$-geodesic, that is, a straight line for the mixture connection. 
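
The link between minimizing the relative entropy and matching the means can 
be checked directly. The following is a sketch, assuming that $\sigma_\theta$ ranges 
over the Gibbs states of the form (43) with normalizing function 
$\psi(\theta) = \log \mathrm{Tr} \exp(\sum_i \theta^i Y_i)$ and that $\mathrm{Tr}\,\rho = 1$; the notation 
$\sigma_\theta$ is introduced here for illustration only. 

```latex
% Sketch: minimize S(rho | sigma_theta) over the Gibbs family
%   sigma_theta = exp( sum_i theta^i Y_i - psi(theta) 1 ).
% sigma_theta is an illustrative label, not notation from the paper.
\begin{align*}
S(\rho|\sigma_\theta) &= \mathrm{Tr}(\rho \log \rho)
   - \sum_{i=1}^{m} \theta^i \, \mathrm{Tr}(\rho Y_i) + \psi(\theta), \\
\frac{\partial \psi}{\partial \theta^i} &=
   \frac{\mathrm{Tr}\left( Y_i \exp\left( \sum_j \theta^j Y_j \right) \right)}
        {\mathrm{Tr} \exp\left( \sum_j \theta^j Y_j \right)}
   = \mathrm{Tr}(\sigma_\theta Y_i), \\
\frac{\partial}{\partial \theta^i} S(\rho|\sigma_\theta) &=
   \mathrm{Tr}(\sigma_\theta Y_i) - \mathrm{Tr}(\rho Y_i) = 0
   \iff \mathrm{Tr}(\sigma_\theta Y_i) = \mathrm{Tr}(\rho Y_i), \quad i = 1, \dots, m.
\end{align*}
```

Because $\psi$ is convex in $\theta$, the stationary point is a minimum, so the 
minimizer is exactly the Gibbs state with the same means as $\rho$. 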

However, if $g$ is a general monotone metric, with respect to which $\nabla^{(1)}$ 
and $\nabla^{(-1)}$ are not necessarily dual, then the relative entropy might fail to 
be a divergence for $(g, \nabla^{(1)}, \nabla^{(-1)})$, and nothing guarantees that minimizing 
$S(\rho|\sigma)$ will produce a point in $\mathcal{S}_m$ connected to $\rho$ by a $(-1)$-geodesic 
intersecting $\mathcal{S}_m$ perpendicularly with respect to $g$. Information geometry then 
no longer provides a mathematical implementation for statistical dynamics. 

As a final word for this paper, let us mention that a corollary to 
theorem 4.2 is the fact that the relation (Q) does not hold for the quantum 
$\alpha$-connections defined using the $\alpha$-representations as in section ||. If it did, 
a simple calculation would show that $\nabla^{(\alpha)}$ and $\nabla^{(-\alpha)}$ are dual with 
respect to the BKM metric (since the $\pm 1$-connections are). But from 
theorem 4.2, this would imply that the BKM metric is a scalar multiple of the 
WYD metric, which is only true in the extreme cases $\alpha = \pm 1$. 